Automate DynamoDB Migration with AWS Serverless Services

Dania Refaie
Jun 18, 2023

Abstract

This article explores DynamoDB cross-region migration using AWS Lambda and Step Functions. It discusses the challenges of manually migrating a large number of tables and highlights the role of automation. A practical demonstration showcases the use of Lambda functions and Step Functions to automate backup and restore operations. The migration process enables efficient replication with minimal restoration time.

DynamoDB Cross-region Migration

Amazon DynamoDB is a fully managed NoSQL database service offered by AWS. It provides fast, predictable performance with seamless scalability, high availability, and durability. DynamoDB organizes data into schemaless tables whose items can each carry different attributes. Every table requires a primary key, which is either a partition key alone or a partition key combined with a sort key. DynamoDB scales automatically by distributing data across servers, which also gives it fault tolerance. It provides encryption at rest, on-demand backups, and point-in-time recovery. With DynamoDB, you can perform fast key-value lookups, rich querying with secondary indexes, and atomic counter increments. It also integrates with other AWS services such as Lambda, S3, and CloudWatch.

DynamoDB cross-region migration is the process of replicating or moving table data from one AWS region to another, so that a copy of your DynamoDB tables exists in a different region. There are several reasons why you might need to migrate DynamoDB tables across regions:

  1. Disaster Recovery: A copy of your tables in a second region lets you recover if the primary region becomes unavailable.
  2. Data Locality and Latency: Migrating DynamoDB tables to a region closer to your users can reduce latency and improve performance.
  3. Compliance and Data Residency: Some regulatory requirements or compliance standards may require data to be stored within specific geographical regions.
  4. Global Data Distribution: If you have a globally distributed application, migrating DynamoDB tables across regions allows you to distribute your data and workload geographically.

DynamoDB On-demand Backups

Cross-region replication with DynamoDB on-demand backups has two steps: create an on-demand backup of the table in the source region, then restore that backup in the destination region. The backup captures the entire table, including its data and indexes, at a specific point in time. Once the backup is available, you restore the table from it by specifying the target table name, the ARN (Amazon Resource Name) of the backup, and the target region.

Once the restoration is complete, applications can be directed to use the new DynamoDB table in the destination region. Because the restored table reflects the source table only as of the backup time, ongoing replication between the regions has to be achieved through other mechanisms such as DynamoDB Streams or periodic data updates, depending on the desired replication frequency.

Migrating a Large Number of DynamoDB Tables

Migrating a large number of DynamoDB tables with the manual backup approach is challenging. Creating a backup and restoring it by hand for each table is time-consuming, error-prone, and labor-intensive. Some of the difficulties involved include:

  1. Time and effort: Creating manual backups and restoring them individually for a large number of DynamoDB tables can be a time-consuming and labor-intensive process. It involves repetitive steps and can be prone to human error.
  2. Complexity and coordination: Managing the migration of multiple tables manually requires careful coordination and tracking. Ensuring that the backup and restore operations are performed correctly for each table can become challenging as the number of tables increases.
  3. Downtime and impact on operations: Restoring tables from backups usually involves some downtime during the migration process. Coordinating the migration without causing disruptions to ongoing operations can be challenging, especially for applications with high availability requirements.

Automate and Orchestrate DynamoDB Cross-region Migration

To address the challenge of migrating a large number of DynamoDB tables, you can leverage AWS Lambda functions and AWS Step Functions to automate and orchestrate the backup and restore functionality. AWS Lambda functions allow you to encapsulate the backup and restore operations for DynamoDB tables. You can create separate Lambda functions for backup and restore processes, which will handle the respective tasks.

AWS Step Functions, on the other hand, provide a serverless workflow service for orchestrating multiple Lambda functions. With Step Functions, you can define the workflow for the migration process, specifying the sequence of steps and the logic for each operation.

To implement this solution, you would define a Step Functions state machine that outlines the workflow for backing up and restoring each DynamoDB table. The state machine defines the order of execution, error handling mechanisms, and any parallel processing if necessary.

DynamoDB Cross-region Migration Demo

In this demo, we will go through the steps required to replicate data from DynamoDB tables in the N. Virginia region (us-east-1) to the Ireland region (eu-west-1). First, five DynamoDB tables are created in the N. Virginia region. Then three Lambda functions are created to perform the following operations:

  • BackupTable Lambda function: takes the name of a table as input and creates an on-demand backup of it.
  • CheckBackupStatus Lambda function: takes the ARN of the backup, checks its status, and returns an isReady flag (the string 'true' or 'false') that reflects whether the backup is available.
  • RestoreTable Lambda function: takes the ARN of the backup and restores it in the destination region.

These Lambda functions are orchestrated by a Step Functions state machine. It starts with a Pass state that holds an array of the DynamoDB table names in the source region and passes it to a Map state, which runs the following logic in parallel for each table in the array:

  1. Create a backup for the table using BackupTable lambda function.
  2. Check the status of the backup using CheckBackupStatus lambda function.
  3. According to the output of the previous step, if the backup is complete, trigger the RestoreTable lambda function.
  4. If the backup status is still pending, wait for one minute and recheck the status.
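
The four steps above can be sketched locally as a plain async loop. This is only an illustration of the control flow, not the state machine itself: the `steps` object and its four functions are hypothetical stand-ins for the three Lambda functions and the wait step, and the `maxChecks` limit is an assumption added here so the sketch cannot loop forever.

```javascript
// Local sketch of the per-table loop. Every function in `steps` is a
// hypothetical stand-in for the corresponding Lambda function or wait step.
async function migrateTable(tableName, steps, maxChecks = 30) {
  // Step 1: create the backup and keep its ARN.
  let state = await steps.backupTable(tableName);

  for (let attempt = 1; attempt <= maxChecks; attempt += 1) {
    // Step 2: ask for the current backup status.
    state = await steps.checkBackupStatus(state);

    // Step 3: restore as soon as the backup is reported ready.
    if (state.isReady === 'true') {
      return steps.restoreTable(state);
    }

    // Step 4: otherwise wait one minute before checking again.
    await steps.wait(60);
  }

  throw new Error(`Backup of ${tableName} not ready after ${maxChecks} checks`);
}
```

Passing the steps in as functions keeps the loop independent of AWS, so the ordering can be exercised with stubs before any Lambda function is deployed.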

DynamoDB table creation in N. Virginia

Five DynamoDB tables (Table1 through Table5 in the state machine definition) are created in the N. Virginia region for this demo.

BackupTable Lambda function

The BackupTable Lambda function runs on the Node.js 16.x runtime, and its execution role needs the DynamoDB permission to create backups (dynamodb:CreateBackup).

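const AWS = require('aws-sdk');

exports.handler = async (event) => {
  // The Map state passes each table name to this function as a plain string.
  const tableName = event;
  const dynamodb = new AWS.DynamoDB();
  console.log(`Backing up table ${tableName}`);

  // The backup is named after the table it was taken from.
  const response = await dynamodb.createBackup({
    BackupName: tableName,
    TableName: tableName
  }).promise();
  const { BackupDetails } = response;

  // Return the ARN so the next state can poll the backup status.
  return {
    backupArn: BackupDetails.BackupArn,
    tableName
  };
};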
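
Reusing the table name as the backup name makes backups from repeated runs hard to tell apart by name. As a hedged variation, a small helper could append a UTC timestamp. The helper name `backupNameFor` is hypothetical, and the length and character limits it checks (3 to 255 characters from letters, digits, underscore, dot, and hyphen) reflect my understanding of the CreateBackup constraints, so verify them against the API reference before relying on them.

```javascript
// Hypothetical helper: builds a backup name from the table name and a UTC timestamp.
function backupNameFor(tableName, now = new Date()) {
  // 2023-06-18T09:30:00.000Z becomes 20230618T093000Z, which uses only
  // characters assumed to be allowed in backup names.
  const stamp = now.toISOString().replace(/[-:]/g, '').replace(/\.\d{3}Z$/, 'Z');
  const suffix = `-${stamp}`;

  // Keep the whole name within the assumed 255-character limit.
  const name = `${tableName.slice(0, 255 - suffix.length)}${suffix}`;

  if (!/^[a-zA-Z0-9_.-]{3,255}$/.test(name)) {
    throw new Error(`Invalid backup name: ${name}`);
  }
  return name;
}
```

In the handler, the result would be passed as `BackupName` while `TableName` stays unchanged.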

CheckBackupStatus Lambda function

The Backup ARN is passed from the previous step and used to check the status of the backup.

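const AWS = require('aws-sdk');

exports.handler = async (event) => {
  // Output of BackupTable (or of a previous check after the wait state).
  const { backupArn, tableName } = event;
  const dynamodb = new AWS.DynamoDB();
  console.log(`Checking backup status for table ${tableName} (${backupArn})`);

  const { BackupDescription } = await dynamodb.describeBackup({
    BackupArn: backupArn
  }).promise();

  // Returned as a string because the Choice state compares it with StringEquals.
  const isReady = BackupDescription.BackupDetails.BackupStatus === 'AVAILABLE' ? 'true' : 'false';

  // Pass tableName and backupArn through so the restore step can use them.
  return {
    isReady,
    tableName,
    backupArn
  };
};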
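
The handler above treats every status other than AVAILABLE as "not ready yet", so a backup that was deleted in the meantime would be polled indefinitely. A possible refinement, sketched below under the assumption that BackupStatus can be CREATING, AVAILABLE, or DELETED, is to fail the task on anything unexpected. The helper name `toReadyFlag` is hypothetical.

```javascript
// Hypothetical helper for CheckBackupStatus: maps a BackupStatus value to the
// string flag that the Choice state compares with StringEquals.
function toReadyFlag(backupStatus) {
  switch (backupStatus) {
    case 'AVAILABLE':
      return 'true';
    case 'CREATING':
      return 'false';
    default:
      // For example DELETED: polling again would never succeed, so fail the task.
      throw new Error(`Unexpected backup status: ${backupStatus}`);
  }
}
```

Throwing from the Lambda function fails the Task state, which stops that table's iteration instead of letting it wait forever.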

RestoreTable Lambda function

After the backup is fully complete, this Lambda function restores the table in the Ireland region as specified in the DynamoDB configuration below.

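const AWS = require('aws-sdk');

exports.handler = async (event) => {
  // The client targets the destination region (Ireland), not the region
  // the Lambda function itself runs in.
  const dynamodb = new AWS.DynamoDB({
    region: 'eu-west-1'
  });
  const { tableName, backupArn } = event;

  // Restore under the same table name, with KMS server-side encryption enabled.
  const result = await dynamodb.restoreTableFromBackup({
    BackupArn: backupArn,
    TargetTableName: tableName,
    SSESpecificationOverride: {
      Enabled: true,
      SSEType: 'KMS'
    }
  }).promise();
  return result;
};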
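
Before restoring, it can be useful to confirm that the backup ARN in the event really belongs to the table named next to it. The sketch below assumes the usual layout of a DynamoDB backup ARN (`arn:<partition>:dynamodb:<region>:<account>:table/<table>/backup/<id>`). Both function names are hypothetical, and the ARN in the usage note is a made-up example.

```javascript
// Hypothetical helper: splits a DynamoDB backup ARN into its parts,
// assuming the layout arn:<partition>:dynamodb:<region>:<account>:table/<table>/backup/<id>.
function parseBackupArn(backupArn) {
  const pattern = /^arn:([^:]+):dynamodb:([^:]+):(\d{12}):table\/([^/]+)\/backup\/([^/]+)$/;
  const match = pattern.exec(backupArn);
  if (!match) {
    throw new Error(`Not a DynamoDB backup ARN: ${backupArn}`);
  }
  const [, partition, region, accountId, tableName, backupId] = match;
  return { partition, region, accountId, tableName, backupId };
}

// Hypothetical guard: throws if the event pairs a backup with the wrong table.
function assertBackupMatchesTable(event) {
  const parsed = parseBackupArn(event.backupArn);
  if (parsed.tableName !== event.tableName) {
    throw new Error(`Backup belongs to ${parsed.tableName}, not ${event.tableName}`);
  }
  return parsed;
}
```

For example, `parseBackupArn('arn:aws:dynamodb:us-east-1:123456789012:table/Table1/backup/01489602797149-73d8d5bc').region` yields the source region, which can be logged next to the destination region for traceability.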

Trigger-Backup Step function

The state machine starts with a Pass state that defines the array of DynamoDB table names and hands it to a Map state, which runs the backup, status check, and restore stages for each table as shown below.

{
  "Comment": "State machine to trigger parallel DynamoDB backup jobs",
  "StartAt": "PassDynamoDBTables",
  "States": {
    "PassDynamoDBTables": {
      "Type": "Pass",
      "Parameters": {
        "dynamodbTables": ["Table1", "Table2", "Table3", "Table4", "Table5"]
      },
      "Next": "ParallelBackups"
    },
    "ParallelBackups": {
      "Type": "Map",
      "ItemsPath": "$.dynamodbTables",
      "Iterator": {
        "StartAt": "TriggerBackup",
        "States": {
          "TriggerBackup": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:{{account_id}}:function:BackupTable",
            "Next": "CheckStatus"
          },
          "CheckStatus": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:{{account_id}}:function:CheckBackupStatus",
            "Next": "StatusEvaluater"
          },
          "StatusEvaluater": {
            "Type": "Choice",
            "Choices": [
              {
                "Variable": "$.isReady",
                "StringEquals": "true",
                "Next": "TriggerRestore"
              },
              {
                "Variable": "$.isReady",
                "StringEquals": "false",
                "Next": "Wait1Minute"
              }
            ]
          },
          "Wait1Minute": {
            "Type": "Wait",
            "Seconds": 60,
            "Next": "CheckStatus"
          },
          "TriggerRestore": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:{{account_id}}:function:RestoreTable",
            "End": true
          }
        }
      },
      "End": true
    }
  }
}
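
State names in a definition like this are wired together by plain strings, so a misspelled `Next` target only surfaces when the state machine is created or run. As a rough pre-flight check, the sketch below walks a parsed definition and lists transitions that point at states that do not exist. The function name `findDanglingTransitions` is hypothetical, and it only inspects `StartAt`, `Next`, `Default`, top-level `Choices`, and nested `Iterator` blocks, which are the fields used in this article.

```javascript
// Hypothetical checker: returns "state -> target" strings for every transition
// whose target is not defined in the same States block.
function findDanglingTransitions(definition, path = '') {
  const problems = [];
  const states = definition.States || {};
  const exists = (name) => Object.prototype.hasOwnProperty.call(states, name);

  if (!exists(definition.StartAt)) {
    problems.push(`${path}StartAt -> ${definition.StartAt}`);
  }

  for (const [name, state] of Object.entries(states)) {
    // Collect every transition target this state can jump to.
    const targets = [
      state.Next,
      state.Default,
      ...(state.Choices || []).map((choice) => choice.Next),
    ].filter((target) => target !== undefined);

    for (const target of targets) {
      if (!exists(target)) {
        problems.push(`${path}${name} -> ${target}`);
      }
    }

    // Map states carry a nested workflow under Iterator.
    if (state.Iterator) {
      problems.push(...findDanglingTransitions(state.Iterator, `${path}${name}.`));
    }
  }
  return problems;
}
```

Calling it with `JSON.parse` of the definition text returns an empty array when every transition resolves.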

Backup/Restore execution

After the state machine is created and its execution role is granted permission to invoke the Lambda functions, starting an execution runs the backup and restore process for each of the five tables.

Upon navigating to the Ireland region, we can observe that the tables were migrated in parallel, with minimal restoration time.
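
Instead of checking the console, the result could also be verified from code. The sketch below reports every table that is not yet ACTIVE in the destination region. It assumes `client` is an aws-sdk v2 DynamoDB client created for that region, for example `new AWS.DynamoDB({ region: 'eu-west-1' })`. The function name and the 'MISSING' label are hypothetical.

```javascript
// Hypothetical check: returns the tables that are not ACTIVE in the region
// the given aws-sdk v2 DynamoDB client points at.
async function findTablesNotActive(client, tableNames) {
  const results = await Promise.all(tableNames.map(async (TableName) => {
    try {
      const { Table } = await client.describeTable({ TableName }).promise();
      return { TableName, status: Table.TableStatus };
    } catch (err) {
      // A table that does not exist yet surfaces as ResourceNotFoundException.
      if (err.code === 'ResourceNotFoundException') {
        return { TableName, status: 'MISSING' };
      }
      throw err;
    }
  }));

  // Keep only the tables that still need attention.
  return results.filter(({ status }) => status !== 'ACTIVE');
}
```

An empty result for the five demo tables would indicate that every restore has finished.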
